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Abstract 

In the last few years, a bewildering variety of methods/software packages that use linear mixed models to account 
for sample relatedness on the basis of genome-wide genomic information have been proposed. We compared 
these approaches as implemented in the programs EMMAX, FaST-LMM, Gemma, and GenABEL (FASTA/GRAMMAR- 
Gamma) on the Genetic Analysis Workshop 18 data. All methods performed quite similarly and were successful in 
reducing the genomic control inflation factor to reasonable levels, particularly when the mean values of the 
observations were used, although more variation was observed when data from each time point were used 
individually. From a practical point of view, we conclude that it makes little difference to the results which 
method/software package is used, and the user can make the choice of package on the basis of personal taste or 
computational speed/convenience. 



Background 

A number of different methods/software packages have 
been proposed in the last few years that implement linear 
mixed-model approaches to account for population struc- 
ture and relatedness among samples in genome-wide asso- 
ciation studies (GWAS), but no detailed comparisons 
among them have been made before our effort. Indeed, 
when a new method/package is developed, it is often quite 
unclear whether or how it differs substantially from those 
already available. To address this question, we explored 
the performance of various implementations of such 
methods in the longitudinal Genetic Analysis Workshop 
18 (GAW18) data set. 

Methods 

We analyzed the GAW18 GWAS data [1] using the real 
phenotypes and the first set of simulated phenotypes. 
This analysis was performed without knowledge of the 
underlying simulating model. The genotype data were 
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cleaned using standard procedures [2] . This resulted in 4 
individuals being excluded because of their total lack of 
genotype data, and another individual being excluded 
because of outlying ethnicity (Chinese [CHB] or Japanese 
[JPT]), leaving 954 individuals whose genotype data were 
used. We removed 43,987 monomorphic or low-fre- 
quency (minor allele frequency [MAF] <1%) single- 
nucleotide polymorphisms (SNPs), 109 SNPs with miss- 
ing rate above 10% (this criterion took into account the 
apparently high missing rate in some SNPs likely to be 
caused by the differences in genotyping technology used 
in the samples), and 1 SNP that failed Hardy- Weinberg 
equilibrium testing in the control founder population. 
A total of 427,952 SNPs were retained for analysis. 

We conducted linear regression of the real and simu- 
lated systolic blood pressure and simulated diastolic 
blood pressure at each time point regressed on age, med- 
ication, and smoking status. For the real diastolic blood 
pressure-which, as could be physiologically expected, 
seemed to have a nonlinear relationship with age-we 
used a quadratic regression, including age and age 
squared as predictors. The phenotype data from all indi- 
viduals were used for these regressions. Residuals from 
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these regressions in subjects who also had genotype data 
were then used for the genome-wide analyses. 

Genome-wide association analyses, adjusting for famil- 
ial relatedness using genomic data, were performed using 
a variety of linear mixed model approaches. All 
approaches attempt to fit the model Y=P+Q+s, where Y= 
iyv •••> is a vector of responses on n subjects; X= {xn^) 
is the n X K matrix of predictor values for variables to be 



modeled as fixed effects (including covariates and geno- 
types at any SNPs currently under test); P=(Pi, ... Pk)^ 
are regression coefficients (to be estimated) representing 
the linear effects of the predictors on the response; Q are 
random effects, Q~N(0,2ag^<I>), and 8 are random errors, 
8~N(0,Oe^^, where a^^ and a^^ are parameters (to be esti- 
mated) representing the genetic and environmental com- 
ponents of variance respectively; O is the n x n matrix of 
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Figure 1 Q-Q plots and genomic inflation factors for different methods. These were calculated for each phenotype (real diastolic blood 
pressure [DBP], real systolic blood pressure [SBP], simulated DBP, and simulated SBP), using either longitudinal ("long") or average ('mean") 
residuals. EM_BN, EMMAX using Balding-Nichols matrix; EMJBS, EMMAX using IBS matrix; FLM_C, FaST-LMM using standard covariance matrix; 
FLM_R, FaST-LMM using realized relationship matrix; GA_FA, GenABEIVFASTA; GA_GRG, GenABEIVGRAMMAR-Gamma; GMA_C, Gemma using 
centralized covariance matrix; GMA_S, Gemma using standardized covariance matrix. The diagonal line represents the identity line in each panel. 
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pairwise kinship coefficients; and / is the n x n identity 
matrix. The approaches vary with respect to precise 
details of the calculation of kinship or "relatedness" and 
with respect to whether an exact method or a fast 
approximation is used (for more details, see descriptions 
in references [3-9]). In each case we used a subset of 
21,153 SNPs to perform the relatedness calculations, 
namely SNPs with MAP >0.4, <5% missing data, and 



"pruned" to be in approximate linkage equilibrium via 
the PLINK command "-indep 50 5 2". In analyses of 
other data sets we have found little difference between 
results when using such a pruned set of SNPs for calcu- 
lating relatedness and when using the full set of SNPs 
(data not shown). 

The methods considered were: (a) EMM AX [3], which 
implements 2 methods for relatedness calculations: one 
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Figure 2 Comparison of -iogio p values at eacfi SNP calculated using different methods. The upper triangles show the values based on 
mean residuals, while the lower triangles show the values calculated using longitudinal data. DBP, diastolic blood pressure; EM_BN, ElVlMAX 
using Balding-Nichols matrix; EMJBS, EMMAX using IBS matrix; FLM_C, FaST-LMM using standard covariance matrix; FLM_R, FaST-LMM using 
realized relationship matrix; GA_FA, GenABEIVFASTA; GA_GRG, GenABEIVGRAMMAR-Gamma; GMA_C, Gemma using centralized covariance matrix; 
GMA_S, Gemma using standardized covariance matrix; SBP, systolic blood pressure. 
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based on identity-by-state (IBS) sharing and one based on 
the Balding-Nichols method [4]; (b) FaST-LMM [5], 
which also implements 2 methods to adjust for related- 
ness: one using a standard covariance matrix and one 
using the realized relationship matrix; (c) the polygenic/ 
mmscore functions in GenABEL [6], which implement 
the FASTA method [7]; (d) the polygenic/grammar func- 
tions in GenABEL, which implement the GRAMMAR- 
Gamma approximation [8]; and (e) Gemma [9], which 
uses an efficient exact method. Simple linear regression 
without any relatedness adjustment was also performed 
in FaST-LMM. All analyses were performed using both 
the residual from each individual observation (modeled 
without regard to its true longitudinal nature, or longitu- 
dinal) and the mean of the residuals for each subject, or 
mean. Genomic inflation factors (k) were calculated as 
proposed by Devlin and Roeder [10]. We also assessed 
the genomic inflation factors for unadjusted and 
Cochran- Armitage trend tests of hypertension status at 
each time point as calculated using PLINK [11]. 



Results and discussion 

Figure 1 shows the Q-Q plots and genomic inflation fac- 
tors for different methods. It is well known that population 
substructure and relatedness will cause an inflated distri- 
bution of genome- wide association test statistics > 1.00) 
if not appropriately modeled. All methods performed rea- 
sonably well for the mean residuals, controlling the X to 
0.99 to 1.03. For longitudinal data, most methods also per- 
formed well, with X in the range of 0.95 to 1.05, except 
perhaps for GRAMMAR-Gamma, which achieved Xs of 
approximately 1.08 to 1.09 for the simulated phenotypes. 
However, even these values were much less inflated com- 
pared to the X values of 1.22 to 1.68 (mean) and 2.04 to 
3.41 (longitudinal) seen in the unadjusted analyses. The 
higher inflation in longitudinal analyses (even when 
adjusting for relatedness) could be expected from the fact 
that additional (nongenetic) within-subject correlation was 
not allowed for in these analyses; indeed, one could argue 
that this behavior is statistically the "correct" behavior, 
with GRAMMAR-Gamma (which gave the highest 
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Figure 3 A selection of Manhattan plots showing p values calculated using various methods DBP, diastolic blood pressure; EM_BN, 
EMIVIAX using Balding-Nichols matrix; FLM_R, FaST-LMM using realized relationship matrix; GA_FA, GenABEL/FASTA; SBP, systolic blood pressure. 
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inflation) showing the "most correct" behavior. Interest- 
ingly, EMMAX using the IBS matrix seemed to have the 
opposite behavior, for reasons we are currently unable to 
determine. 

For the analyses using hypertension status, the unad- 
justed genomic inflations were between 1.21 and 1.55 
for the Cochran-Armitage trend test and between 1.01 
and 1.27 for the %^ test. 

Figure 2 compares the individual -logio p values from 
different methods. Most methods gave highly concor- 
dant results, particularly EMMAX (BN) and Gemma, 
whereas the 2 GenABEL methods were similar but less 
concordant. This is analogous to findings on single- 
observation data by Zhou and Stephens [9]. FaST-LMM 
tended to perform slightly differently from the other 
methods at SNPs with lower significance, although the 
results overall were still quite similar. 

Figure 3 shows a selection of Manhattan plots. For 
each phenotype, the results from all methods were quite 
similar, although the longitudinal data tended to show 
stronger signals. No clearly significant SNP was found 
in any phenotype, which is not surprising given the rela- 
tively small size of the GAW18 data set, which is under- 
powered for detecting (at genome-wide levels of 
significance) anything other than strong genetic effects. 
The high concordance in significance levels (at any 
given SNP) achieved by the different software packages 
(see Figure 2) indicates that no package is substantially 
more powerful than another, as expected from the fact 
that all packages implement slightly different versions of 
essentially the same statistical model. 

Although the results from all packages considered 
here were similar, the implementations did vary in 
speed. All packages performed the analysis in reasonable 
time (less than 1 day) on our system. Precise timings 
will depend on the computer resources and architecture 
available, but as a rule of thumb we found FaST-LMM 
and GRAMMAR-Gamma to be the fastest (taking just a 
few hours), followed by EMMAX and Gemma, which 
took 12 to 16 hours, and GenABEL/FASTA, which took 
18 to 20 hours. 

Conclusions 

All methods performed well and results were similar, 
particularly at the most significant SNPs. We conclude 
that (at least for nonlongitudinal traits) it makes little 
difference to the results which method/software package 
is used, and the user can make the choice of package on 
the basis of personal taste, speed, or computational con- 
venience. For longitudinal traits (modeled without 
regard to their longitudinal nature) the slight differences 
seen between the methods would be an interesting topic 
for further investigation, but it is beyond the scope of 
the current article. 
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